PREFACE


Connectionist  approaches  to  cognitive  performance  are  generally  based  on  traditional 
network  or  associative  frameworks  in  which  learning  plays  a  pivotal  role.  Studies  of  learning 
within  these  frameworks  have  long  been  a  staple  of  those  areas  of  experimental  psychology 
concerned  with  cognitive  performance  and  its  ontology.  Kamin’s  rediscovery  of  blocking 
of  classical  (Pavlovian)  conditioning  in  rats  twenty  years  ago  triggered  a  movement  within 
animal  learning  psychology  that  stresses  the  informational  and  cognitive  aspects  of  tasks 
such  as  conditioning,  discrimination  learning,  and  stimulus  generalization.  Theoretical 
models  that  have  grown  up  around  these  newer  approaches  have  been  extended  to  a  variety 
of  problems  of  computation  and  process  control.  Some  of  these  extensions  and  applications 
have  been  described  in  recently  published  proceedings  of  the  Cognitive  Science  Society  and 
in  papers  generated  by  Professor  A.  G.  Barto’s  Adaptive  Network  Group. 

Many network learning algorithms conform to the delta rule. One important member
of this class of algorithms is the Widrow-Hoff rule, which Barto and his associates have
shown to be closely related to an influential theory in psychology, the Rescorla-Wagner
(RW) model. The RW model was devised as a mechanistic account of blocking and other
multiple-cue protocols in the experimental literature on classical conditioning. The RW
model and two theories described in this report can also be applied to problems of
interest to adaptive network researchers. One such problem area is that of credit
assignment. Multiple-cue training protocols in classical conditioning have been cast in
these terms, as Barto and Richard Sutton have illustrated in a series of reports. It has
lately become commonplace to assess network learning algorithms by their ability to
emulate the phenomenology of classical conditioning.

Useful  as  the  RW  model  has  proven  to  be,  some  of  its  competitors  in  the  animal  learning 
literature  seem  equally  capable,  if  not  more  so.  This  report  describes  simulation  studies  of 
two of these alternatives to the RW model within the framework of classical conditioning.
The  results  of  these  simulation  studies  suggest  alternative  computational  forms  of  both 
models.  These  revised  models  have  been  successfully  applied  to  neural  network  theories  of 
hippocampal  function  and  the  formation  of  spatio-temporal  cognitive  maps. 

Abstract

Attentional models offer an alternative to the highly successful theory of Rescorla
and Wagner (1972) for describing blocking, overshadowing, and many other features
of classical conditioning. Two such models are the Moore and Stickney (1980) version
of Mackintosh’s (1975) attention theory and the Pearce and Hall (1980) model.
These models emphasize variations in the associability of CSs instead of variation
in the effectiveness of the reinforcing event, the US. Early published variants of the
Moore-Stickney and Pearce-Hall models do not always accurately portray the effects of
nonreinforced CS presentations as represented in simulation experiments. In the case
of the Moore-Stickney model, levels of conditioned responding under partial
reinforcement are too low to reasonably approximate expectations based on the experimental
literature, and extinction is too deep to produce the rapid reacquisition that typically
follows extinction. These problems are corrected by changing the expressions in the
model for decreasing associative strength. The revised model retains the positive
features of the original, e.g., the ability to simulate in real time latent inhibition and
compound CS effects such as blocking and conditioned inhibition. The P-H model is
path dependent and highly nonlinear under partial reinforcement. The problem can be
corrected either by modifying and restricting the rules for computing the associability
of the CS, or by modifying the rules for computing associative strength. The revised
model retains the original's ability to simulate latent inhibition, compound CS effects,
and the transfer (positive or negative) from training with a weak US to training with
a stronger US.


Introduction 

This article reviews two mathematical models of classical conditioning that stress
attentional processes, the Moore-Stickney (M-S) model (Moore & Stickney, 1980, 1982, 1985)
and the Pearce-Hall (P-H) model (Pearce & Hall, 1980). These models feature mechanisms
for altering the contribution of the CS to the rate of change of the associative
relationship between a conditioned stimulus (CS) and the unconditioned stimulus (US), and
they have been dubbed “attentional” models by their originators. Thus, instead of citing
variations in the effectiveness of the US to account for the phenomena of conditioning as
does, e.g., the Rescorla-Wagner (R-W) model, these models emphasize processes that cause
the associability of the CS to vary. In assessing the M-S and P-H models in simulation
experiments, we discovered some serious shortcomings in published versions of each. In
this article we review the mathematical statement of both models in detail, indicate
their shortcomings, and suggest modifications. These modifications are designed to
enhance further development of each model into domains where computational versions of
attentional theories of learning might find application, including neuroscience and
artificial intelligence.

Following Rescorla and Wagner (1972), the symbol V (for associative value) is used
throughout to denote the primary theoretical dependent variable representing the strength
and sign of the associative relationship between a CS and the US. Linear difference
equations express how V changes from one trial to the next. Although the resulting
associative structures at the representation level are not isomorphic with performance
measures such as the probability of a conditioned response (CR), the mapping from V to
behavioral indices of learning is at least monotonic in most applications (see Frey &
Sears, 1978, for an extended treatment of this issue in relation to classical
conditioning of the rabbit eyeblink).

Assumptions  and  formal  structure  of  mathematical  models  of  classical  conditioning 
in  the  contemporary  theoretical  literature  basically  revolve  around  two  questions.  One 
question  is  whether  the  strength  of  associative  links  to  the  US  among  various  components 
of  a  compound  CS  must  be  partitioned  or  shared  (the  “zero-sum”  rule),  as  in  the  case  of 
the  R-W  model,  or  whether  a  given  component  CS  can  in  principle  develop  a  complete 
associative  link  to  the  US  despite  the  presence  of  competitors.  Mackintosh  (1975)  has 
provided  a  lucid  discussion  of  this  point,  and  the  distinction  finds  a  parallel  in  the  field  of 
artificial or machine learning where the “zero-sum rule” is a feature of ADALINES and
noncompeting associations are a feature of perceptrons (see Duda & Hart, 1973; Sutton &
Barto, 1981).

The second question concerns the relationship between the rate of learning or,
equivalently, the magnitude of the change in associative value on a given trial by a given
CS, and the previous history of reinforcement of the CS. This question reduces to one of
deciding whether the parameter of a given model that determines changes in V of the CS
remains invariant over training trials, as in the R-W model, or whether CS associability is
permitted to change from one trial to the next. Some models, such as Mackintosh’s (1975)
attention theory and the M-S model, which do not feature a zero-sum rule, and Frey and
Sears’s (1978) “catastrophe” model, which does, assume that reinforcement momentarily
increases the salience or associability of the CS and thereby its contribution to the rate of
learning. Other models assume that the salience or associability of a CS decreases because
of “reduced processing,” as in Wagner’s (1976) habituation theory and the P-H model.

Our interest in the M-S and P-H models arose from efforts to nurture a link between
learning theory and the literature from neurobiology on the role of the hippocampus in
learning and memory, particularly classical conditioning. The rationale for choosing
attentional models for developing this linkage has been enunciated by the authors in other
articles (Moore, 1979; Schmajuk, 1984; Schmajuk & Moore, 1985). We do not claim that
other models cannot also provide the basis of a rigorous theoretical approach to the
neurobiology of learning in its various forms. We simply do not feel qualified to proffer
comment on all extant learning models. Any benefits in enhanced understanding of the
hippocampus or the nature of classical conditioning accruing from these efforts must
ultimately rest on how well our models perform in describing even the most mundane
features of this type of learning.


Overview 

Using Mackintosh’s (1975) attention theory as a scaffold, the M-S model employs a simple
linear difference equation to express associative increments from one trial to the next
during a simple acquisition protocol. When a trial is defined as the paired occurrence of
the CS and US,

$$\Delta V = \alpha \theta (1 - V) \qquad (1)$$

$\alpha$ is the rate parameter contributed by the CS, $\theta$ is the rate parameter
contributed by the US, and $\alpha$ and $\theta$ are between 0 and 1. It has become
customary to refer to $\alpha$ as the associability of the CS. The asymptotic level of
learning equals 1 rather than the US-intensity dependent parameter $\lambda$ in
Mackintosh’s original paper.
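
The behavior of Equation 1 over a run of reinforced trials can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `acquire`, the parameter values, and the choice to hold the associability constant are ours, not part of the published model, in which the associability may change from trial to trial.

```python
def acquire(alpha, theta, n_trials, v0=0.0):
    """Iterate Equation 1 over reinforced trials and return V after each trial.

    alpha and theta are held constant here (an assumption made for illustration).
    """
    v = v0
    history = []
    for _ in range(n_trials):
        v += alpha * theta * (1.0 - v)  # Equation 1: dV = alpha * theta * (1 - V)
        history.append(v)
    return history


# Illustrative values only, chosen to show the shape of the curve.
curve = acquire(alpha=0.5, theta=0.1, n_trials=100)
print(f"V after 1, 10, and 100 trials: {curve[0]:.3f}, {curve[9]:.3f}, {curve[99]:.3f}")
```

With constant rate parameters the curve is negatively accelerated and approaches, but never exceeds, the asymptote of 1.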

Our  reason  for  restricting  the  asymptotic  level  of  associative  strength  to  1  is  that  we 
interpret  V  to  represent  the  strength  of  the  organism’s  “belief”  that  the  US  follows  the 
CS.  Anthropomorphically,  an  organism  can  be  no  more  than  100  per  cent  certain  that  a 
CS  will  be  followed  by  a  given  US.  The  degree  of  this  conviction,  which  might  be  likened 
to wagering, is independent of the intensity of either event. In short, we interpret V to
be an index of belief, prediction, or inferred causality; as such, it represents
the reliability of an internodal link within an associative network. Moore and Stickney
(1985,  p.  228)  discuss  some  of  the  consequences  and  limitations  of  this  conceptualization 
regarding  the  asymptote  of  learning.  The  issue  is  largely  irrelevant  for  present  purposes 
except  insofar  as  it  bears  on  the  number  of  degrees  of  freedom  available  for  describing 
phenomena. 

Whereas $\theta$ is treated as a constant within a given application of Equation 1,
$\alpha$ can vary from trial to trial. This variation in $\alpha$ accounts for most, if not
all, of the phenomena that prompted development of the R-W model, e.g., blocking and
overshadowing. The parameter $\alpha$ also appears in the R-W model, but it is typically
treated as a constant because the mechanism that predicts these phenomena resides in the
zero-sum rule (but see Wagner, 1978).

The P-H model expresses the increments in associative value accompanying acquisition
by the expression

$$\Delta V = S \alpha \lambda \qquad (2)$$

As in the M-S model, $\alpha$ varies according to rules that are sufficient in themselves
to predict blocking and overshadowing, as well as simple acquisition and other phenomena.

Although  both  models  share  the  R-W  model’s  ability  to  predict  many  of  the  better 
established  and  more  highly  visible  phenomena  of  classical  conditioning,  the  rules  for 
computing $\alpha$ in the two models differ profoundly. Mackintosh (1975) and M-S assume
that $\alpha$ for a CS increases from one reinforced trial to the next, provided it is the
best predictor of the US (largest $V$) among all stimuli, including CSs and the context,
occurring at the same time. On these occasions, $\alpha$s for stimuli with smaller $V$s
decrease, and therefore their capacities to further strengthen associative links to the US
are diminished. Thus, changes in $\alpha$s for a set of CSs depend on their relative $V$s
with respect to the US. By contrast, the P-H model assumes that $\alpha$ for a CS that is
consistently paired with a US decreases over trials. Despite this difference in
assumptions regarding variations in $\alpha$, the two models give qualitatively similar
predictions in many protocols.

The two models are similar in another respect: both the M-S and P-H models assume
inhibitory learning in parallel with acquisition of V. It is therefore possible for a given
CS to possess both excitatory and inhibitory associations concurrently, the two
associations summating algebraically to determine CR strength. By contrast, the R-W model
assumes that a CS can possess either an excitatory or an inhibitory association with the
US, but not both simultaneously. Moore and Stickney (1982, 1985) refer to “negative”
learning as the acquisition of an antiassociation; Pearce and Hall (1980) use the term
inhibitory conditioning. We use the symbol N to denote the inhibitory variable in the two
models.

Shortcomings  in  Brief 

Although  the  M-S  model  successfully  describes  many  facets  of  conditioned  and  latent 
inhibition,  it  fails  to  predict  realistic  scenarios  under  partial  reinforcement,  extinction,  and 
related  protocols  involving  nonreinforced  trials.  Our  proposed  solution  involves  changing 
the  expressions  for  computing  decreases  of  V  and  increases  of  N  that  are  applied  on 
nonreinforced trials. The P-H model has problems describing partial reinforcement, and
these have been discussed by its originators (Pearce, Kaye, & Hall, 1982). The model
predicts nonmonotonic relationships between the percentage of trials that are reinforced
and the level of conditioning achieved. Another shortcoming of the P-H model is that
terminal values of associative strength are highly dependent on the length of sequences of
reinforced and nonreinforced trials. One proposed solution involves alternative rules for
computing $\alpha$. Another solution involves changes in rules for computing $V$ and $N$.

Moore and Stickney Model

Moore and Stickney (1980) developed their model originally in order to place
Mackintosh’s (1975) theory on a firm computational footing. Summarizing the model as
described in a recent chapter (Moore & Stickney, 1985): (a) Associative value, $V$,
represents the prediction of the US by a CS. (b) Antiassociative value, $N$, represents the
prediction of nonreinforcement by a CS. (c) The strength of a CR to a given CS depends on
its net associative value, $\bar{V}$, given by $V - N$. (d) $V$ for a given CS increases
when it accurately predicts the US and decreases otherwise. (e) By contrast, $N$ increases
when the sum of $V$s for all CSs present is above some threshold, and the US does not
occur. $N$ decreases whenever the US does occur. (f) Changes of $V$ and $N$ depend on the
associability, $\alpha$, of the CS. (g) $\alpha$ increases to the extent that the CS is a
better predictor of the US than other stimuli (including itself) in the situation.
Otherwise, it decreases. (h) This dependence of $\alpha$ on the predictive associative
relationships among all stimuli in the situation implies the existence of a network of
$V$s and $N$s. (i) The model applies to real time, implying that computations occur
continuously, both within and between trials.

In the following formal statement of the M-S model, subscripts are used to specify
stimuli in the role of predictor. Superscripts denote the target of the prediction. Thus, if
A and B are two stimuli such as a CS and US, then $V_A^B$ designates the associative value
of A predicts B, $N_A^B$ designates the antiassociation A predicts not B, and
$(V_A^B - N_A^B) = \bar{V}_A^B$ is the net value of the relationship. It is important to
note that $V_A^B$ does not equal $V_B^A$.

Generalizing the notation, when the $i$th CS is accompanied or followed by a $k$th event,
the associative value between CS$_i$ and event $k$, $V_i^k$, is increased by

$$\Delta V_i^k = \alpha_i \theta \tau (1 - V_i^k) \qquad (3)$$

When event $k$ does not occur, the associative value between CS$_i$ and the event $k$,
$V_i^k$, is decreased by

$$\Delta V_i^k = \alpha_i \theta' \tau (0 - V_i^k) \qquad (4)$$

The parameter $\alpha_i$ is the associability of the $i$th CS; it ranges between 0 and 1.
The parameter $\theta$ ($0 < \theta < 1$) is the rate of change in the association when the
reinforcer is presented, and $\theta'$ ($0 < \theta' < \theta$) is the rate of change of
the association when the reinforcer is not presented.

The parameter $\tau$ is a function of time such that

$$\tau = \ldots \qquad (5)$$

where $q$ is a constant equal to the optimal interval for association, $d_{ik} > 0$ is the
interval between the $i$th CS and the $k$th event in time steps, and $k$ is a constant
($0 < k < 1$). In our simulations, $\tau$ with $d_{ik} < q$ was set equal to .067. (Refer
to Schmajuk & Moore, 1985, for a fuller explanation of implementation.)

The rule for increasing the antiassociation between CS$_i$ and event $k$ is as follows:
whenever CS$_i$ is neither accompanied nor followed by the event $k$ and the sum of the
associative values of all CSs present, $\sum_j V_j^k$, exceeds an arbitrary and constant
threshold $L$, antiassociative value, $N_i^k$, is increased by

$$\Delta N_i^k = \alpha_i \theta' \tau (1 - N_i^k) \qquad (6)$$

whenever $\sum_j V_j^k > L$. The rule for decreasing $N_i^k$ is as follows: whenever the
$k$th event follows CS$_i$, the antiassociative value decreases by

$$\Delta N_i^k = \alpha_i \theta \tau (0 - N_i^k) \qquad (7)$$

The net associative value of CS$_i$ with respect to event $k$ is
$\bar{V}_i^k = V_i^k - N_i^k$.
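
The update rules of Equations 3, 4, 6, and 7 can be sketched for the simplest case of one CS and one US. The sketch rests on several assumptions that are ours rather than the model's: the associability is held constant, the time factor is collapsed to 1.0 so that each trial is a single computational step (the simulations reported here used .067 per time step), the threshold value is arbitrary, and the threshold test uses the value of V held at the start of the trial. The names `ms_trial` and `run` are hypothetical. The sketch is not intended to reproduce the figures in this report.

```python
def ms_trial(v, n, reinforced, alpha=0.5, theta=0.1, theta_prime=0.01,
             tau=1.0, threshold=0.1):
    """Apply the original M-S rules for one CS and one US on a single trial.

    tau=1.0 and threshold=0.1 are illustrative assumptions, not published values.
    """
    if reinforced:
        v_new = v + alpha * theta * tau * (1.0 - v)        # Equation 3
        n_new = n + alpha * theta * tau * (0.0 - n)        # Equation 7
    else:
        v_new = v + alpha * theta_prime * tau * (0.0 - v)  # Equation 4
        if v > threshold:                                  # summed V exceeds L
            n_new = n + alpha * theta_prime * tau * (1.0 - n)  # Equation 6
        else:
            n_new = n
    return v_new, n_new


def run(schedule, **params):
    """Return the net value V - N after each trial of a reinforcement schedule."""
    v = n = 0.0
    net = []
    for reinforced in schedule:
        v, n = ms_trial(v, n, reinforced, **params)
        net.append(v - n)
    return net


continuous = run([True] * 400)        # 100% reinforcement
partial = run([True, False] * 200)    # strict alternation, i.e. 50% reinforcement
print(f"net value, continuous: {continuous[-1]:.3f}; alternating: {partial[-1]:.3f}")
```

Because the antiassociation grows only on nonreinforced trials once V has crossed the threshold, the net value under the alternating schedule settles below the level reached with continuous reinforcement.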

In the M-S model, associability of a CS depends on associative processes. The
associability of CS$_i$, $\alpha_i$, may increase, decrease, or remain unchanged according
to a weighted combination of event-specific components, $\Delta\alpha_i^k$. These
event-specific components are computed based on the relationship between the associative
value of CS$_i$ and the event $k$ and the associative value of another CS, CS$_j$, with the
same event $k$. Whenever CS$_i$ and CS$_j$ are present together with the $k$th event and
provided $V_i^k > V_j^k$,

$$\Delta\alpha_i^k = c (1 - \alpha_i)(V_i^k - V_j^k) \qquad (8)$$

$V_j^k$ always corresponds to the second highest associative value with respect to event
$k$ of all stimuli present with CS$_i$, including the context and/or the US.

If $V_i^k < V_j^k$,

$$\Delta\alpha_i^k = c (0 - \alpha_i)(V_j^k - V_i^k) \qquad (9)$$

where $V_j^k$ is the highest associative value with respect to $k$ of all stimuli present
with CS$_i$. The parameter $c$ in Equations 8 and 9 is set $0 < c < 1$. Once the
event-specific components of $\alpha_i$ have been computed, they are combined in the
expression

$$\Delta\alpha_i = \frac{\sum_j \phi_j \, \Delta\alpha_i^j}{\sum_k \phi_k} \qquad (10)$$

The weight assigned to each $\Delta\alpha_i^j$ is indicated by the constant $\phi_j$. The
sum over the index $j$ in the numerator of Equation 10 involves all the events present on
that trial or time step. The sum over the index $k$ in the denominator is over all the
events or stimuli the subject has encountered in the context, even though some of these
may not be present at the time that $\Delta\alpha_i$ is computed. Thus, the numerator of
Equation 10 involves associations among stimuli that are present at the moment of
computation, whereas the denominator involves the weights of these stimuli plus those
encountered previously. The US is presumed to be represented in memory more strongly than
are CSs, which are in turn typically weighted more heavily than the context.
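
The associability computation of Equations 8 through 10 can be sketched as two small functions. The function names, the argument layout, and the numerical values in the usage lines are illustrative assumptions; in particular, returning zero when the two associative values are exactly equal is our reading of "remain unchanged," a case the equations do not cover explicitly.

```python
def delta_alpha_component(alpha_i, v_i, v_others, c=0.3):
    """Event-specific change in the associability of CS i (Equations 8 and 9).

    v_others holds the associative values, toward the same event, of the other
    stimuli present with CS i; the largest of these plays the role of V_j.
    """
    best_other = max(v_others)
    if v_i > best_other:
        return c * (1.0 - alpha_i) * (v_i - best_other)  # Equation 8
    if v_i < best_other:
        return c * (0.0 - alpha_i) * (best_other - v_i)  # Equation 9
    return 0.0  # tie: assumed to leave associability unchanged


def delta_alpha(components, weights_present, weights_all):
    """Equation 10: weighted components for events present on this step,
    normalized by the summed weights of every event encountered in the context."""
    numerator = sum(w * d for w, d in zip(weights_present, components))
    return numerator / sum(weights_all)


# Illustrative numbers only: one event (the US) is present on this step.
up = delta_alpha_component(alpha_i=0.5, v_i=0.8, v_others=[0.2, 0.1])
down = delta_alpha_component(alpha_i=0.5, v_i=0.2, v_others=[0.8])
total = delta_alpha([up], weights_present=[1.0], weights_all=[1.0, 0.16, 0.01])
print(f"component up: {up:.4f}, component down: {down:.4f}, combined: {total:.4f}")
```

The best predictor gains associability in proportion to its lead over the runner-up, and a poorer predictor loses associability in proportion to its deficit relative to the best stimulus.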

Partial  Reinforcement  and  Extinction 

Shortcomings of the M-S model came to light in simulations of certain protocols
involving nonreinforcement: partial reinforcement, extinction, and simple differential
conditioning. Although it is unreasonable to require that any model be universally
applicable to all tasks and circumstances, the current versions of both models are so
widely at variance with the experimental literature that corrective measures seemed called
for. This is particularly the case regarding partial reinforcement during acquisition. The
experimental literature suggests that partial reinforcement in classical conditioning
results either in a lower level of CR strength (e.g., CR frequency) than that obtained
under 100% reinforcement or else a level that is just as high as in the 100% case. Once
acquired, a CR can be maintained at close to full strength with schedules of reinforcement
as lean as 5% (Gormezano, Kehoe, & Marshall, 1983). Gormezano and Moore (1969) tabulated
7 of 15 studies across a range of species and preparations in which 50% partial
reinforcement resulted in levels of CR strength following acquisition that were
significantly below those observed under 100% reinforcement. No difference was noted in
the remaining 8 studies. This much seems clear: Levels of conditioned responding under
partial reinforcement ought neither to exceed those under 100% reinforcement, appetitive
instrumental conditioning tasks being a well known exception (Kimble, 1961), nor be so low
as to portend imminent extinction.

Simulations with the M-S Model

The following simulation experiments all assumed that $V$ and $N$ for both the CS and
the context have initial values of 0 prior to any training. The initial value of $\alpha$
for the CS was .5; that of the context was .1. The parameter $c$ in Equations 8 and 9 was
.3, and $\theta$ and $\theta'$ in Equations 3 and 4 were .1 and .01, respectively.
Following Moore and Stickney (1980), the weights $\phi_j$ and $\phi_k$ in Equation 10 were
1.0 for the US, .16 for CSs, and .01 for the context. Figure 1 illustrates the M-S model
applied to 50% partial reinforcement. It shows that acquisition of $\bar{V}$ as a function
of trials with a 50% reinforcement protocol initially increases and then decreases to a
stable level well below that predicted with the same parameter set for 100% reinforcement.
A portrayal more in keeping with the literature would show $\bar{V}$ under 50%
reinforcement increasing uniformly and leveling off to a point just below that obtained
with 100% reinforcement. As Fig. 1 makes clear, the problem arises from unrestrained
development of $N$ once the threshold for triggering Equation 6

$$\Delta N_i^k = \alpha_i \theta' \tau (1 - N_i^k)\Bigl(\sum_j V_j^k - \sum_j N_j^k\Bigr) \qquad (12)$$

Figures 2-4 show that the new expressions for decreasing $V$ and increasing $N$ allow the
M-S model to more accurately describe partial reinforcement (Fig. 2), extinction (Fig.
3), and reacquisition (Fig. 4). In the case of partial reinforcement, Fig. 2 shows that the
revised model allows normal appearing monotonic acquisition of $\bar{V}$ to a level below
that obtained under continuous reinforcement.

Figure 3 shows that $\bar{V}$ in extinction can go below zero, but not so dramatically as
before. The change also implies that reacquisition would normally be more rapid than
original acquisition (see Fig. 4), in agreement with some of the literature (Scavio, Ross,
& McLeod, 1983). In the case of the rabbit nictitating membrane (NM) response, rapid
reacquisition may not be specific to the target CS because of a general transfer process in
which initial acquisition to one CS promotes rapid acquisition to another, quite distinct
CS (Kehoe, Morrow, & Holt, 1984).

It should be noted that neither the older nor the here-modified version of the M-S model
generates spontaneous recovery. This criticism applies to all contemporary models,
including P-H and R-W. Exploring the full implications of this modification of the M-S
model lies beyond the scope of this article. However, simulation experiments indicate that
the descriptive power of the model remains largely intact when Equations 11 and 12 are
employed instead of Equations 4 and 6 in protocols involving nonreinforced trials, most
notably conditioned and latent inhibition.

Pearce  and  Hall  Model 

Pearce and Hall (1980) proposed their model as an alternative to Mackintosh-type
attention theories because of their discovery that a series of acquisition trials with a
weak shock US in a conditioned suppression task can retard subsequent acquisition using a
stronger shock US (Hall & Pearce, 1979), a phenomenon they liken to latent inhibition
(LI). Negative transfer (NT) due to initial training with a weak US falls naturally out
of the P-H model by virtue of the assumption that a CS’s associability decreases with
repeated pairings with a US. Although NT from a weak to a strong US in conditioned
suppression has been replicated by others, it is not always obtained in such studies; nor is
there evidence for the effect in the rabbit NM response preparation where positive transfer
seems the rule (Ayres, Moore, & Vigorito, 1984). Such positive transfer does not disprove
the model because it is predicted whenever the initial value of $\alpha$ is relatively small
and the US in the first phase of training produces a relatively high level of $V$. Negative
transfer is the surprising result; it is predicted whenever the first-phase US yields only
a low level of $V$ and the initial value of $\alpha$ is large (see Ayres et al., 1984, for
elaboration of this point). Initial values of $\alpha$ presumably depend on generalization
from other similar stimuli outside the training context (Pearce & Hall, 1980, page 538).

In  addition  to  predicting  NT,  the  P-H  model  provides  a  mechanism  for  conditioned 
inhibition,  thereby  filling  a  void  in  previous  attention  theories.  It  was  only  after  their  model 
first  appeared  that  Moore  and  Stickney  (1982;  1985)  incorporated  conditioned  inhibition 
into  their  Mackintosh-type  model. 

We now summarize the P-H model as we understand it: (a) The excitatory component
of the associative relationship between a CS and US, $V$, increases whenever the CS’s
associability, $\alpha$, is greater than zero and the intensity of the US on a given trial,
$\lambda$, is larger than that predicted by all the CSs present on that trial; $V$ never
decreases. (b) The inhibitory associative component, $N$, increases whenever $\alpha$ is
greater than zero and the intensity of the US on a given trial is less than or equal to that
predicted by all the CSs present on that trial. Like $V$, $N$ for a given CS can never
decrease. (c) Net associative strength equals the sum of the differences between the $V$s
and $N$s of all the CSs present on the trial, $\sum (V - N)$. We denote this net
associative value as $\bar{V}$, in keeping with the notation used in the M-S model; it is
the net prediction of a US equal to $\lambda$ by all stimuli present on a given trial.
(d) $\alpha$ for computations of $V$ and $N$ for all CSs present on a given trial equals
the absolute value of the difference between $\lambda$ of the preceding trial and
$\bar{V}$ of the current trial. (e) Letting $\alpha$ equal the geometric mean of the
$\alpha$s for previous trials is a permissible option under the model (see, e.g., Kaye &
Pearce, 1984).

Stated formally, whenever the intensity of the US presented on a given trial is larger
than the sum of the $\bar{V}$s of all CSs present on that trial, $V_i$ increases according
to

$$\Delta V_i = S_i \alpha_i \lambda \qquad (13)$$

$S_i$ is the salience of CS$_i$, $\alpha_i$ is its associability, and $\lambda$ is the
intensity of the US ($S_i$, $\alpha_i$, and $\lambda > 0$). Whenever the intensity of the
US presented on a given trial is less than the sum of $\bar{V}$s of all CSs present on that
trial, $N_i$ increases according to

$$\Delta N_i = S_i \alpha_i \bar{\lambda} \qquad (14)$$

$\bar{\lambda} = \sum_j \bar{V}_j - \lambda$, where $\sum_j \bar{V}_j$ is the sum of net
associative values of all CSs, including CS$_i$, acting on Trial $n - 1$. Equations 13 and
14 imply that when the intensity of the reinforcer is less than $\bar{V}_i$, $V_i$ does
not decrease, but rather $N_i$ increases until it reaches the same value as $V_i$. When
the sum of $\bar{V}$s of all CSs equals $\lambda$, Equation 14 rather than Equation 13
should be applied in order that $V_i$ remains unchanged.

The value of $\alpha_i$ on Trial $n$ for all CSs acting on Trial $n - 1$ is given by

$$\alpha_i^n = \Bigl| \lambda^{n-1} - \sum_j \bar{V}_j^{\,n-1} \Bigr| \qquad (15)$$

The expression $\sum_j \bar{V}_j$ represents the prediction of the US by all CSs, including
CS$_i$, present on Trial $n - 1$. When the reinforcer is accurately predicted by all the
CSs present on Trial $n - 1$, $\alpha_i$ becomes zero on Trial $n$. Equation 15 cannot be
used to determine $\alpha_i$ on the first trial in which CS$_i$ is presented, and some
initial value must be assigned. When CS$_i$ is presented on a second occasion, however,
$\alpha_i$ is determined by Equation 15.
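
For a single CS, one trial of the P-H rules in Equations 13 through 15 might be sketched as follows. The function name `ph_trial` is hypothetical, and two points are assumptions on our part: the prediction entering Equation 15 is taken to be the net value held before the current trial's update, and the salience of 0.2 in the usage loop is chosen only so that acquisition takes several trials.

```python
def ph_trial(v, n, alpha, lam, salience=1.0):
    """One trial of the P-H model for a single CS.

    Returns the updated V, the updated N, and the associability for the next trial.
    """
    net = v - n                                # net prediction of the US
    if lam > net:
        v = v + salience * alpha * lam         # Equation 13
    else:
        lam_bar = net - lam                    # overprediction of the US
        n = n + salience * alpha * lam_bar     # Equation 14
    next_alpha = abs(lam - net)                # Equation 15 (pre-update net: an assumption)
    return v, n, next_alpha


# Ten reinforced trials with an illustrative salience of 0.2.
v, n, alpha = 0.0, 0.0, 1.0
for trial in range(1, 11):
    v, n, alpha = ph_trial(v, n, alpha, lam=1.0, salience=0.2)
    print(trial, round(v - n, 3), round(alpha, 3))
```

As the net value approaches the US intensity, the associability for the next trial shrinks toward zero, which is the property that produces negative transfer in this model.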

Simulations  with  the  P-H  Model:  Partial  Reinforcement 

Pearce et al. (1982) point out that the original model has difficulty describing acquisition
under partial reinforcement. For example, no conditioning is predicted when reinforced
and nonreinforced trials are alternated if the sequence begins with a nonreinforced trial.
In this case, on Trial 1 both $\lambda$ and $\bar{V}_i$ equal zero, and therefore $\alpha_i$ on Trial 2 is zero. On
Trial 2, when the US is presented, $\lambda = 1$, but since $\alpha_i = 0$, no increase in $V$ can occur.
In order to solve this problem, Pearce et al. (1982) suggested using the geometric mean of
values of $\alpha_i$ computed on previous trials:

$\alpha_i^n = \gamma |\lambda^{n-1} - \sum \bar{V}_j^{n-1}| + (1 - \gamma) \alpha_i^{n-1}$ (16)

The parameter $\gamma$ is between 0 and 1. Equation 16 yields Equation 15 when $\gamma = 1$.
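
The alternation problem and its remedy can be checked with a short single-CS simulation.
The sketch below is illustrative only: the function name, the default salience and initial
associability, and the use of the pre-update prediction in the error term are assumptions.

```python
def simulate_ph(schedule, gamma, S=1.0, alpha0=1.0):
    """Single-CS P-H simulation with associability from Equation 16.

    schedule is a sequence of US intensities (0 for a nonreinforced trial).
    Returns the net associative value after each trial.
    """
    V = N = 0.0
    alpha = alpha0
    history = []
    for lam in schedule:
        net = V - N                      # prediction before the update
        if lam > net:
            V += S * alpha * lam         # Equation 13
        else:
            N += S * alpha * (net - lam) # Equation 14
        # Equation 16; with gamma = 1 this reduces to Equation 15.
        alpha = gamma * abs(lam - net) + (1 - gamma) * alpha
        history.append(V - N)
    return history

# Alternating schedule that begins with a nonreinforced trial.
alternating = [0, 1] * 5
stuck = simulate_ph(alternating, gamma=1.0)     # stays at zero throughout
learns = simulate_ph(alternating, gamma=0.5)    # grows on the first US trial
```

With $\gamma = 1$ the associability is zero on every reinforced trial, so the net value never
leaves zero; with $\gamma = .5$ the associability entering Trial 2 is .5, and conditioning proceeds.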

Problems for the model under partial reinforcement are not entirely corrected by Equation
16, as illustrated in Fig. 5. Figure 5 shows simulated net associative values ($\bar{V}$) for a
single CS as a function of 50% and 80% randomly reinforced trials. Notice that the asymptotic
value of $\bar{V}$ with 50% is higher than that predicted with 80% reinforcement, and that
the asymptotic value of $\bar{V}$ with 80% reinforcement is higher than that obtained with 100%
reinforcement. (Starting values of $V_i$ and $N_i$ were 0; $\alpha_i = 1$ initially, $\gamma = .5$ or 1, $\lambda = 1$,
and the initial value of $S_i = 1$.) In attempting to rearrange these asymptotic levels so that
higher levels of conditioning correspond to higher percentages of reinforcement, Pearce et al.
(1982) introduced additional rate parameters into Equations 13 and 14. These parameters
are denoted $\beta_E$ and $\beta_I$ for changes in excitatory and inhibitory association, respectively,
and they are bounded between 0 and 1. Equation 13 becomes

$\Delta V_i = S_i \alpha_i \beta_E \lambda$ (17)

and Equation 14 becomes

$\Delta N_i = S_i \alpha_i \beta_I \bar{\lambda}$ (18)

Using computer simulations, Pearce et al. (1982) showed that when $\beta_E < \beta_I$ the growth
of $\bar{V}$ with 50% reinforcement reaches an asymptote lower than that attained with 100%
reinforcement. Equations 17 and 18 yield a lower asymptote for $\bar{V}$ because on reinforced
trials $V$ increases less than $N$ does on nonreinforced trials. Our simulations confirm this
point. However, Fig. 5 suggests that when growth of $\bar{V}$ with 80% reinforcement is adjusted
to levels close to those predicted for 100% reinforcement, by adjusting $\beta_E$ and $\beta_I$, the
asymptotic value of $\bar{V}$ attained with 50% reinforcement is too low. This is so because
with 80% reinforcement $\beta_E$ needs to be much smaller than $\beta_I$, and this combination of $\beta$s
does not allow $\bar{V}$ to grow enough with 50% reinforcement. Therefore, the introduction of
additional rate parameters, as proposed by Pearce et al. (1982), does not yield appropriate
asymptotes for $\bar{V}$ with different percentages of reinforcement.

Alternative  Forms  of  the  Pearce  and  Hall  Model 

In order to improve the P-H model's rendering of partial reinforcement, we considered
two alternative forms of the model. The first computes $\alpha$ on the basis of the outcome of
Trial $n$ instead of the outcome of Trial $n - 1$ as in Equation 15. That is, $\lambda^{n-1}$ in Equation
15 is replaced by $\lambda^n$:

$\alpha_i^n = |\lambda^n - \sum \bar{V}_j^{n-1}|$ (19)

Equations 13 and 14 remain unchanged. Stated somewhat anthropomorphically, Equation
19 implies that the subject waits for the outcome of a trial before deciding by what amount
to increment $V$ by Equation 13 or $N$ by Equation 14. The general form of $\alpha_i$ in the revised
model is given by the expression

$\alpha_i^n = \gamma |\lambda^n - \sum \bar{V}_j^{n-1}| + (1 - \gamma) \alpha_i^{n-1}$ (20)

Unrestricted use of Equation 20 does not correct the prediction of higher asymptotic
levels of responding under partial reinforcement than under 100% reinforcement. The
desired result requires that $\alpha$ be computed with either Equation 20 ($\gamma < 1$) or Equation
19 ($\gamma = 1$), whichever yields the smaller value of $\alpha$. Without this restriction the model
predicts higher $\bar{V}$ with 80% reinforcement than with 100%, as did the original P-H model.
Figure 6 shows the simulated $\bar{V}$ for a single CS as a function of 50% and 80% reinforced
trials, using Equation 20 with $\gamma = 1$ and .5. In the latter case ($\gamma = .5$) the above-mentioned
restriction was applied. With either $\gamma$, this restricted-$\alpha$ version of the P-H model yields
asymptotic levels of responding for 50% and 80% reinforcement that are (a) lower than
that predicted for 100% reinforcement, (b) sufficiently high, and (c) in the appropriate
order.
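
The restricted rule can be sketched for a single CS as follows. This is an illustration
under stated assumptions, not the simulation code used for Figure 6: the function name,
the default salience of .5, and the initial associability of 1 are hypothetical choices.

```python
def simulate_restricted_alpha(schedule, gamma, S=0.5, alpha0=1.0):
    """Single-CS P-H simulation using the restricted-alpha rule.

    Associability on each trial is the smaller of the Equation 19 value
    and the Equation 20 value, computed from the current trial's outcome.
    Returns the net associative value after each trial.
    """
    V = N = 0.0
    alpha = alpha0
    history = []
    for lam in schedule:
        net = V - N                                    # prediction entering the trial
        error = abs(lam - net)                         # Equation 19
        averaged = gamma * error + (1 - gamma) * alpha # Equation 20
        alpha = min(averaged, error)                   # restriction described above
        if lam > net:
            V += S * alpha * lam                       # Equation 13
        else:
            N += S * alpha * (net - lam)               # Equation 14
        history.append(V - N)
    return history
```

Under continuous reinforcement with these assumed parameters the net value climbs
.5, .75, .875 and approaches the US intensity without overshooting, because each
increment is at most the salience times the current prediction error.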

In the second alternative form of the model, Equations 13 and 14 are replaced by a
single equation expressing the changes in $\bar{V}$ instead of separate changes of $V$ and $N$:

$\Delta \bar{V}_i = S_i \alpha_i (\lambda^n - \sum \bar{V}_j^{n-1})$ (21)

Equation 21 implies that $\bar{V}$ converges to $\lambda$, increasing when $\lambda$ increases, and decreasing
when $\lambda$ decreases. Equation 21 may be regarded as the R-W model with the addition
of a modifiable associability term. It is similar to the expression proposed by Wagner
(1978) to encompass CS preexposure effects within the framework of the R-W model. The
expressions for changes in $\alpha$ are the same as in the original model, i.e., Equations 15 or 16
apply.
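
A minimal single-CS sketch of this second form is given below. The function name and the
default parameter values are assumptions for illustration, and associability is updated
with Equation 16 using the error observed on the trial just completed.

```python
def simulate_rw_form(schedule, gamma=1.0, S=0.5, alpha0=1.0):
    """Single-CS simulation of the R-W-like form (Equation 21).

    A single net value is moved toward the US intensity on every trial,
    scaled by salience and a modifiable associability (Equation 16).
    Returns the net associative value after each trial.
    """
    v_net = 0.0
    alpha = alpha0
    history = []
    for lam in schedule:
        error = lam - v_net                # signed prediction error
        v_net += S * alpha * error         # Equation 21
        # Equation 16: associability carried into the next trial.
        alpha = gamma * abs(error) + (1 - gamma) * alpha
        history.append(v_net)
    return history
```

Unlike Equations 13 and 14, the same expression produces both growth and decline: in this
sketch three reinforced trials followed by one nonreinforced trial give a net value that
rises on the first three trials and falls on the fourth.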

Figure 7 shows the simulated $\bar{V}$ of a single CS as a function of 50% and 80% reinforced
trials using Equation 21 and with both rules for computing $\alpha$. As in the case of the
first alternative version of the model, Equation 21 predicts asymptotic levels for 50% and
80% reinforcement that are (a) lower than that predicted for 100% reinforcement, (b)
sufficiently high, and (c) in the appropriate order. However, Equation 16 yields a higher
level of responding than Equation 20 with 80% reinforcement.

Figure 8 summarizes the predictions made by the original and two alternative versions
of the model for a wide range of percentages of reinforcement. Rates of reinforcement in
the 10% to 50% range were obtained by introducing the required number of nonreinforced
trials between two reinforced trials. Rates of reinforcement in the 66% to 90% range were
obtained by introducing the required number of reinforced trials between two nonreinforced
trials. When the original P-H model (Equation 16) is applied with $\beta_E = \beta_I = 1$, any
rate of reinforcement of 50% or more yields an asymptote that exceeds the level obtained
with 100% reinforcement. When $\beta_E = .015$ and $\beta_I = .10$, only rates of reinforcement of 90%
or more achieve sufficiently high asymptotes; asymptotes with lower reinforcement probability
are too low to agree with empirical expectations. The two alternative versions of the model
yield asymptotic levels of net associative strength that are more realistic. In these cases
the relationship between asymptotic associative strength and reinforcement probability is
in closer agreement with empirical expectations, tending to lie on a line with slope equal
to 1.

Experiments in pigeon autoshaping have shown that partial reinforcement can produce
higher levels of responding than continuous reinforcement (Gibbon, Farrell, Locurto, Duncan,
& Terrace, 1980). Gibbon et al.'s results show that response rate monotonically decreases
with increasing probabilities of reinforcement. Neither the original nor the revised
versions of the P-H model can account for this phenomenon. Simulations with the original
P-H model with = .1 show that $\bar{V}$ first increases and then remains constant with
increasing rates of reinforcement (Fig. 8). Simulations with the revised models show that
$\bar{V}$ increases with increasing rates of reinforcement (Fig. 8). According to Gibbon et al.
(1980) the effect of partial reinforcement on response rate parallels the well-established
effect of partial reinforcement on instrumental learning, and might be explained in terms
of the frustration generated by nonreinforced trials (Amsel, 1962).

In  addition  to  the  problem  of  inappropriate  asymptotic  levels  of  net  associative  strength 
under  partial  reinforcement,  the  original  P-H  model  is  severely  path  dependent.  That  is, 
terminal  levels  of  V  depend  on  the  sequential  pattern  of  reinforced  and  nonreinforced  trials. 
Path  dependency  is  a  concern  only  in  tasks  in  which  the  asymptotic  level  of  conditioned 
responding  is  known  to  be  sensitive  to  the  percentage  of  trials  that  are  reinforced  but 
relatively  insensitive  to  the  sequential  structure  that  underlies  that  percentage.  In  the 
case  of  classical  aversive  conditioning,  such  as  the  eye  blink  in  humans  and  rabbits,  for 
example,  asymptotic  performance  levels  are  not  particularly  sensitive  to  the  sequential 
properties of trials (but see Hoehler & Leonard, 1973).

Path  dependency  of  the  original  and  alternative  versions  of  the  P-H  model  is  contrasted 
in  Table  1.  The  entries  are  the  average  V  on  the  last  10  trials  following  300  trials  of 
patterned  50%  reinforcement  in  which  runs  of  reinforced  trials  were  alternated  with  equally 
long  runs  of  nonreinforced  trials.  Initial  parameterization  was  the  same  as  in  Fig.  8. 
Equation 16 of the original model yields the greatest path dependence, as indexed by the
range of entries under the first column (.11). The revised forms of the model (Equations
13 and 14, with $\alpha$ computed with Equation 20, and Equation 21, with $\alpha$ computed with
either Equation 16 or Equation 20) reduce this range to .03 and .02, respectively. Thus,
path dependence under partial reinforcement is substantially reduced by using either of
the alternative forms of the model instead of the original.
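
The path-dependence index used in Table 1 can be sketched as below. The helper names are
hypothetical, and `simulate` stands for any function that maps a trial schedule to a list of
net associative values, one per trial; no attempt is made here to reproduce the table's
entries, since the full parameterization is not restated in this section.

```python
def run_schedule(run_length, n_trials=300):
    """Patterned 50% schedule: runs of 1s (reinforced) alternating with
    equally long runs of 0s (nonreinforced), truncated to n_trials."""
    period = [1] * run_length + [0] * run_length
    return [period[t % len(period)] for t in range(n_trials)]

def terminal_average(history, last=10):
    """Average net associative value over the final `last` trials."""
    return sum(history[-last:]) / last

def path_dependence(simulate, run_lengths=(1, 2, 3, 4, 5)):
    """Range of terminal averages across run lengths; smaller values
    indicate less sensitivity to the sequential pattern of trials."""
    finals = [terminal_average(simulate(run_schedule(r))) for r in run_lengths]
    return max(finals) - min(finals)
```

Run lengths 1 through 5 correspond to the basic sequences 10, 1100, 111000, 11110000, and
1111100000 listed in Table 1.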

Implications for Latent Inhibition and Negative Transfer

As indicated above, the original P-H model (Equation 16) predicts both LI and NT.
With $\gamma = 1$ in Equation 16, LI occurs with a single CS preexposure. Retarded acquisition
is predicted because CS presentation in the absence of the US produces zero associability,
thereby preventing any increase in $V$ on the first reinforced trial. By contrast, NT requires
that a sufficient number of CS-US pairings have occurred to decrease $\alpha$ to zero. Conditions
leading to NT are (a) low $\lambda$ in the first phase of training, (b) high initial $\alpha$, and (c) a low
value of $\gamma$. Positive transfer is likely if any one of these conditions is not satisfied. When
$\gamma = .5$ in Equation 16, both LI and NT require more than a single CS presentation prior
to acquisition with a strong US to reflect the effect of Stage 1 trials. In both instances
retarded acquisition with a strong US in the second phase of training comes about because
CS presentations in the first phase cause $\alpha$ to decrease to zero, thereby producing a
comparatively small average $\alpha$ during early reinforced trials.
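
The latent inhibition argument reduces to a few lines of arithmetic for a single CS under
the original model. In the sketch below the function name and the defaults (salience .5,
initial associability 1, US intensity 1) are assumptions, chosen so that a control CS with
no preexposure gains .50 on its first reinforced trial.

```python
def first_reinforced_gain(n_preexposures, gamma, S=0.5, alpha0=1.0, lam=1.0):
    """Excitatory gain on the first reinforced trial after CS-alone
    preexposure, using Equation 16 for associability and Equation 13
    for the increment."""
    alpha = alpha0
    for _ in range(n_preexposures):
        # CS alone: US intensity and net prediction are both zero, so the
        # error term of Equation 16 is zero and V and N do not change.
        alpha = gamma * 0.0 + (1 - gamma) * alpha
    return S * alpha * lam  # Equation 13 on the first reinforced trial
```

With $\gamma = 1$ a single preexposure drives the gain to zero, whereas with $\gamma = .5$ the
associability is halved on each preexposure, so several presentations are needed before
acquisition is appreciably retarded.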

Table 2 shows $\bar{V}$ on the first trial following Stage-1 latent inhibition (LI) and negative
transfer (NT) paradigms as predicted by the original and alternative forms of the model.
The original version of the model (P-H ($n - 1$)), with $\alpha$ defined according to Equation
16, yields LI and NT with $\gamma = .5$ or 1. Table 2 shows that the first alternative version
of the model (P-H ($n$)), with $\alpha$ defined according to Equation 20, yields LI and NT only
when $\gamma = .5$. As in the original model, retarded acquisition in the second phase of LI and
NT results from reduced $\alpha$ on early reinforced trials. Both LI and NT reflect the number
of CS presentations on the first stage of training. The behavior of the second alternative
form of the model (P-H$_{R-W}$ ($n - 1$)), with $\bar{V}$ computed by Equation 21, depends on
whether $\alpha$ is computed with Equation 16 or Equation 20. With Equation 16, LI and NT
are predicted with $\gamma = .5$ or 1. With Equation 20, LI and NT are predicted only when
$\gamma = .5$.

One-Trial Blocking

Unlike the R-W model, the models considered here do not allow for blocking on the
first compound-CS trial following Stage-1 training to a single CS. The question of whether
blocking occurs on the first Stage-2 trial has been the focal point of experimental efforts
to determine which type of theory is to be preferred. Until recently, most available evidence
suggested that blocking requires at least two Stage-2 trials to occur (e.g., Mackintosh,
Dickinson, & Cotton, 1980). More recent evidence on the question suggests that one-trial
blocking, as anticipated by the R-W model, can occur under some circumstances (Balaz,
Kasprow, & Miller, 1982; Dickinson, Nicholas, & Mackintosh, 1983).

The original P-H model does not permit one-trial blocking, and in this respect it
resembles the M-S model. Using Equation 16 to compute $\alpha$, one-trial blocking is not
possible because at least one compound-CS trial is necessary in order to reduce the added
CS's initial value of $\alpha$ to the near-zero value implied by the presence of the previously
conditioned CS. However, if $\alpha$ is computed by Equation 20, one-trial blocking is possible
because the one-trial delay does not arise. Equation 21 allows for one-trial blocking,
independent of the expression used to compute $\alpha$, for the same reason that the R-W
model predicts one-trial blocking, i.e., because the expression $(\lambda - \sum \bar{V}_j)$ can be near 0 on
the first compound trial provided $\sum \bar{V}_j$ of the Stage-1 CS is near $\lambda$.
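
The contrast on the first compound trial can be made concrete with a small calculation.
The function name and default values below are hypothetical; the sketch assumes that the
added CS starts with zero net value and that the pretrained CS's net value is still below
the US intensity, so that Equation 13 is the applicable rule in the original model.

```python
def first_compound_trial_gain(v_pretrained, alpha_added, S=0.5, lam=1.0):
    """Increment to the added CS on the first compound trial.

    Returns (gain under Equation 13, gain under Equation 21).
    Assumes lam > v_pretrained and that the added CS has zero net value,
    so the summed prediction equals v_pretrained.
    """
    # Equation 13 ignores how well the US is already predicted.
    ph_gain = S * alpha_added * lam
    # Equation 21 scales the increment by the remaining prediction error.
    rw_gain = S * alpha_added * (lam - v_pretrained)
    return ph_gain, rw_gain

ph_gain, rw_gain = first_compound_trial_gain(v_pretrained=0.95, alpha_added=1.0)
```

With a pretrained value of .95 and a full initial associability for the added CS, Equation
13 gives the added CS its whole first-trial increment, while Equation 21 gives it only a
small fraction of that, which is the one-trial blocking effect described above.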

General  Discussion 

The M-S and P-H models fail to provide acceptable renderings of acquisition under
partial reinforcement and certain other phenomena. Modifications of these models alleviate
these shortcomings while retaining their basic assumptions and predictive power (see
Schmajuk & Moore, 1985). The revised version of the M-S model presented here gives
improved predictions for extinction, reacquisition, partial reinforcement, and simple two-CS
differential conditioning. Regarding the P-H model, two approaches to improved
performance were considered. Both provide reasonably good renderings of acquisition under
partial reinforcement, with appropriate asymptotic levels of net associative value and
suppression of path dependency. In addition, both alternative forms of the P-H model are
able to predict one-trial blocking. Because it involves only a minor change in the computation
of a CS's associability, the first alternative form of the P-H model more closely
resembles the original than does the second alternative. The second version changes the
computation of excitatory associative value so as to place the P-H model into the same
family as the R-W model and the Sutton-Barto model (Sutton & Barto, 1981). The second
version is interesting because the discrepancy between the actual outcome of a trial and the
anticipated outcome, obtained by summing the predictions of all CSs present on that trial,
determines both $\alpha$ and asymptotic values of net associative strength. This tactic is similar
to one proposed by Frey and Sears (1978) in which the attentional variable represents the
information value of a CS in terms of its recent associative value, and it allows the model
to predict latent inhibition. In order to predict latent inhibition without recourse to an
attentional variable, Wagner (1978) proposed that changes in the CS effectiveness might
be represented in the R-W model by the inclusion of a variable reflecting how well the CS
is predicted by stimuli that precede it.

Our suggested revisions of the two models might be challenged as being entirely arbitrary.
This is not the case, as they were arrived at largely through a process of trial
and error in which various remedial approaches were implemented in a wide range of
simulation protocols that included simple acquisition, blocking, conditioned inhibition,
differential conditioning, extinction, latent inhibition, and overshadowing. This process of
trial and error emphasized a point about mathematical models that is often overlooked by
their detractors: Mathematical structure, not simply number of parameters or degrees of
freedom, dictates the descriptive power of a model.

We do not deny the possibility that other modifications will be discovered that prove
preferable; we have simply not discovered any that retain the basic mathematical character
of the originals without introducing new constructs. Revisions more drastic than those
considered here might take the form of attentional or hybrid models that possess the best
features of the M-S, P-H, and R-W models. One clue that this might be possible is
suggested in the revised M-S model, in which the threshold for triggering antiassociations
is replaced by a mechanism very similar to that used by the P-H model to generate
conditioned inhibition. Another clue is suggested in the second revised version of the
P-H model (Equation 21), which is essentially an R-W model but with a mechanism for
controlling CS associability. In this respect it can be classed with the Frey and Sears
(1978) model.

Simulation experiments with alternative versions of the M-S and P-H models should
ultimately point the way to further refinements and better specification of the appropriate
domains for each. We are a long way from declaring a clear preference for either model, and
the experimental literature suggests that each may have its place. As a class, P-H models
may be most appropriate for characterizing events in the domain of conditioned suppression
and perhaps, more generally, in systems involving autonomic-like processes of arousal and
orienting (Kaye & Pearce, 1984). The revised M-S model may be more appropriate in the
domain of discrete skeletal responses such as the rabbit NM response (Ayres et al., 1984).
Wherever further explorations of these models may lead, we believe our approach illustrates
some of the benefits to theory construction and assessment to be derived through simulation
experiments over a broad range of training scenarios. Comparisons among competing
theories can be sharpened without recourse to experimentation. Although real experiments
always have a place in choosing among competing theories, simulation experiments can
guide decisions regarding protocols that are most likely to resolve such choices.
TABLE 1. Pearce-Hall Rules for Computing Associability: Average Net
Associative Value on the Last 10 Trials Following 300 Trials Consisting of
Alternating Runs of Reinforced and Nonreinforced Trials

                                      Algorithm
Sequence       P-H (n - 1)    P-H (n)    P-H_{R-W} (n - 1)    P-H_{R-W} (n)
10                 .08          .66            .50                .50
1100               .10          .65            .49                .49
111000             .12          .65            .51                .51
11110000           .17          .65            .51                .51
1111100000         .19          .63            .50                .50
Range              .11          .03            .02                .02

Note. The basic sequence, repeated over the 300 trials, is indicated as a series of 1 (reinforced) and
0 (nonreinforced) trials. P-H (n - 1) refers to the original rule for computing $\alpha$ with Equation 16.
P-H (n) refers to the alternative rule for computing $\alpha$ with Equation 20, with $\gamma = 1$ when
$\alpha^{n-1} > \alpha^n$. P-H_{R-W} (n - 1) refers to the alternative rule for computing $\bar{V}$ with Equation 21 and
$\alpha$ with Equation 16. P-H_{R-W} (n) refers to the alternative rule for computing $\bar{V}$ with Equation 21
and $\alpha$ with Equation 20.


TABLE 2. Pearce-Hall Rules for Computing Associability: Net Associative
Value After One Reinforced Trial Following 5 or 10 Trials of Either CS-alone
or CS Paired with a Weak US in Stage 1.

                                         Algorithm
             P-H (n - 1)      P-H (n)        P-H_{R-W} (n - 1)    P-H_{R-W} (n)
Gamma         1      .5      1      .5           1       .5          1      .5
Control      .50    .50     .50    .50          .50     .50         .50    .50
LI (5)       .00    .02     .50    .26          .00     .03         .50    .26
LI (10)      .00    .02     .50  [illegible]    .00     .00         .50    .25
NT (5)       .11    .12     .51    .29          .09     .10         .50    .30
NT (10)      .10    .10     .52    .30          .09   [missing]     .50    .29

Note. The number of trials on the first phase of LI or NT is indicated in parentheses. Control
groups received neither CS preexposure nor CS paired with a weak US. In NT, $\lambda = .1$ during the
first phase of training. P-H (n - 1) refers to the original rule for computing $\alpha$ with Equation 16.
P-H (n) refers to the alternative rule for computing $\alpha$ with Equation 20, with $\gamma = 1$ when
$\alpha^{n-1} > \alpha^n$. P-H_{R-W} (n - 1) refers to the alternative rule for computing $\bar{V}$ with Equation 21 and
$\alpha$ with Equation 16. P-H_{R-W} (n) refers to the alternative rule for computing $\bar{V}$ with Equation 21
and $\alpha$ with Equation 20.


Figure  Captions 


Figure 1. Partial reinforcement with the Moore-Stickney model: $V$, $N$, and net
associative value $\bar{V}$, as a function of trials for 100% and 50% random reinforcement. $V$
and $N$ with 50% reinforcement are plotted separately.

Figure 2. Partial reinforcement with the modified Moore-Stickney model: $V$, $N$, and
net associative value $\bar{V}$, as a function of trials for 100% and 50% random reinforcement.
$V$ and $N$ with 50% reinforcement are plotted separately.

Figure 3. Extinction with the original and revised Moore-Stickney model: Net
associative value $\bar{V}$ as a function of trials.

Figure 4. Reacquisition with the original and revised Moore-Stickney model. Initial
acquisition (100% reinforcement) with the original Moore-Stickney model is shown in
Figs. 1-2.

Figure 5. Partial reinforcement with the original Pearce-Hall model. Net associative
value as a function of trials under 50% (alternated reinforced and nonreinforced trials)
and 80% (four reinforced followed by one nonreinforced trial) reinforcement. (1):
$\beta_E = \beta_I = .5$; (2): $\beta_E < \beta_I$ [values illegible]. Dashed line indicates asymptote reached with
a continuously reinforced schedule.

Figure 6. Partial reinforcement with the alternative Pearce-Hall model using Equation
20. Net associative value $\bar{V}$ as a function of trials under 50% and 80% reinforcement. (1):
$\gamma = 1$; (2): $\gamma = .5$.

Figure 7. Partial reinforcement with the alternative Pearce-Hall model using Equations
21 and 22. Net associative value $\bar{V}$ as a function of trials under 50% and 80%
reinforcement. (1): Equation 21; (2): Equation 22.

Figure 8. Partial reinforcement with the original and alternative Pearce-Hall models.
Net associative value as a function of percentage of reinforcement: P-H, $n - 1$, $\beta_E < \beta_I$,
refers to the original rule for computing $\alpha$ using Equation 16 with $\gamma = .5$; P-H, $n - 1$,
$\beta_E = \beta_I$, refers to the original rule for computing $\alpha$ using Equation 16 with $\gamma = .5$; P-H,
$n$, $\beta_E = \beta_I$, refers to the alternative rule for computing $\alpha$ using Equation 20 with $\gamma = .5$;
P-H_{R-W}, $n - 1$, $\beta_E = \beta_I$, refers to the alternative rule for computing $\bar{V}$ using Equation 21
and with $\alpha$ computed using Equation 20.